AITopics | nonparametric contextual bandit

nonparametric contextual bandit

Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems, Dec-25-2025, 21:01:07 GMT

metric space, name change, nonparametric contextual bandit, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.44)

Tracking Most Significant Shifts in Nonparametric Contextual Bandits

Neural Information Processing Systems, Dec-23-2025, 23:51:47 GMT

We study nonparametric contextual bandits where Lipschitz mean reward functions may change over time. We first establish the minimax dynamic regret rate in this less understood setting in terms of the number of changes $L$ and the total variation $V$, both capturing all changes in distribution over context space, and argue that state-of-the-art procedures are suboptimal in this setting. Next, we turn to the question of _adaptivity_ for this setting, i.e. achieving the minimax rate without knowledge of $L$ or $V$. Quite importantly, we posit that the bandit problem, viewed locally at a given context $X_t$, should not be affected by reward changes in other parts of the context space $\mathcal{X}$. We therefore propose a notion of _change_, which we term _experienced significant shifts_, that better accounts for locality, and thus counts considerably fewer changes than $L$ and $V$. Furthermore, similar to recent work on non-stationary MAB (Suk & Kpotufe, 2022), _experienced significant shifts_ only count the most _significant_ changes in mean rewards, e.g., severe best-arm changes relevant to observed contexts. Our main result is to show that this more tolerant notion of change can in fact be adapted to.
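For readers unfamiliar with the terms in this abstract, the quantities it mentions can be written down roughly as follows. This is a hedged sketch of one standard formalization, not a quotation from the paper: the symbols $f_t^a$ (mean reward of arm $a$ at round $t$), $\pi_t$ (the arm played), $\lambda$ (Lipschitz constant), and $\mathcal{D}_t$ (the context-reward distribution at round $t$) are notation assumed here for illustration.

```latex
% Lipschitz mean rewards (assumed notation): for every round t and arm a,
\[
  \lvert f_t^a(x) - f_t^a(x') \rvert \le \lambda \, \lVert x - x' \rVert
  \quad \text{for all } x, x' \in \mathcal{X}.
\]
% Dynamic regret: compare against the best arm at each round's own context.
\[
  R_T = \sum_{t=1}^{T} \Bigl( \max_{a \in [K]} f_t^a(X_t) - f_t^{\pi_t}(X_t) \Bigr).
\]
% Global change measures: both count changes anywhere in context space.
\[
  L = \sum_{t=2}^{T} \mathbf{1}\{\mathcal{D}_t \neq \mathcal{D}_{t-1}\},
  \qquad
  V = \sum_{t=2}^{T} \lVert \mathcal{D}_t - \mathcal{D}_{t-1} \rVert_{\mathrm{TV}}.
\]
```

Under this reading, a change far from every observed context still increases $L$ and $V$, which is the looseness the locality-aware notion of experienced significant shifts is meant to avoid.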

name change, nonparametric contextual bandit, significant shift, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.63)

Reviews: Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems, Jan-26-2025, 10:05:50 GMT

The paper is clear and makes an effort to highlight the behavior of the proposed algorithm (the value of the regret bound in some specific settings, experiments measuring the impact of the metric). The comparison to other settings could still be reinforced. In the experimental part, I would also appreciate a comparison that includes some state-of-the-art algorithms. What would be the empirical results of a Gaussian process-based bandit? It would also be interesting to have results on datasets used by other contextual/similarity-based bandits (except that these datasets use a context in $\mathbb{R}^d$). Finally, it is surprising to have a context space of dimension 1. Extending the algorithm to the $\mathbb{R}^d$ setting seems straightforward.

metric space, nonparametric contextual bandit, unknown metric, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.43)

Reviews: Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems, Jan-26-2025, 10:05:39 GMT

This paper proposes a nonparametric approach to contextual bandits that can adapt to unknown simple structure among the arms. The reviewers found the paper to be novel in how it combines existing ideas to solve an interesting problem. Additionally, the reviewers found the paper to be clearly written and of potential significance for the problem it tackles. Some minor concerns were raised; however, the authors seem to have addressed them in their response. The authors should incorporate any edits suggested by the reviewers as well as the updates promised in the author response.

metric space, nonparametric contextual bandit, unknown metric, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.40)

Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems, Oct-10-2024, 17:29:40 GMT

Consider a nonparametric contextual multi-armed bandit problem where each arm $a \in [K]$ is associated with a nonparametric reward function $f_a: [0,1] \to \mathbb{R}$ mapping from contexts to the expected reward. Suppose that there is a large set of arms, yet there is a simple but unknown structure amongst the arm reward functions. We present a novel algorithm which learns data-driven similarities amongst the arms, in order to implement adaptive partitioning of the context-arm space for more efficient learning. We provide regret bounds along with simulations that highlight the algorithm's dependence on the local geometry of the reward functions.
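To make the setting above concrete, here is a minimal sketch of a much simpler baseline: a fixed uniform partition of the context interval $[0,1]$ with an independent UCB rule in each bin. This is not the paper's algorithm, which learns arm similarities and partitions the context-arm space adaptively. The function name `binned_ucb`, the Gaussian noise model, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def binned_ucb(reward_fns, horizon, n_bins, noise=0.1, seed=0):
    """Run UCB separately in each context bin and return cumulative regret.

    reward_fns: list of functions f_a mapping a context in [0, 1] to a mean reward.
    """
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    counts = [[0] * n_arms for _ in range(n_bins)]   # pulls per (bin, arm)
    sums = [[0.0] * n_arms for _ in range(n_bins)]   # reward totals per (bin, arm)
    regret = 0.0
    for t in range(1, horizon + 1):
        x = rng.random()                              # context drawn uniformly
        b = min(int(x * n_bins), n_bins - 1)          # index of the bin containing x
        untried = [a for a in range(n_arms) if counts[b][a] == 0]
        if untried:
            arm = untried[0]                          # pull every arm once per bin
        else:
            arm = max(
                range(n_arms),
                key=lambda a: sums[b][a] / counts[b][a]
                + math.sqrt(2.0 * math.log(t) / counts[b][a]),
            )
        means = [f(x) for f in reward_fns]
        reward = means[arm] + rng.gauss(0.0, noise)   # noisy observed reward
        counts[b][arm] += 1
        sums[b][arm] += reward
        regret += max(means) - means[arm]             # regret against the best arm at x
    return regret


if __name__ == "__main__":
    fns = [lambda x: x, lambda x: 1.0 - x]
    print(binned_ucb(fns, horizon=2000, n_bins=8))
```

With Lipschitz reward functions, bins of width `1 / n_bins` keep the within-bin variation of each $f_a$ small, at the cost of learning every bin from scratch; sharing information across similar arms and refining the partition only where needed is the kind of saving the abstract describes.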

metric space, nonparametric contextual bandit, unknown metric, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.47)

Tracking Most Significant Shifts in Nonparametric Contextual Bandits

Neural Information Processing Systems, Oct-9-2024, 21:54:01 GMT

We study nonparametric contextual bandits where Lipschitz mean reward functions may change over time. We first establish the minimax dynamic regret rate in this less understood setting in terms of the number of changes $L$ and the total variation $V$, both capturing all changes in distribution over context space, and argue that state-of-the-art procedures are suboptimal in this setting. Next, we turn to the question of _adaptivity_ for this setting, i.e. achieving the minimax rate without knowledge of $L$ or $V$. Quite importantly, we posit that the bandit problem, viewed locally at a given context $X_t$, should not be affected by reward changes in other parts of the context space $\mathcal{X}$. We therefore propose a notion of _change_, which we term _experienced significant shifts_, that better accounts for locality, and thus counts considerably fewer changes than $L$ and $V$. Furthermore, similar to recent work on non-stationary MAB (Suk & Kpotufe, 2022), _experienced significant shifts_ only count the most _significant_ changes in mean rewards, e.g., severe best-arm changes relevant to observed contexts. Our main result is to show that this more tolerant notion of change can in fact be adapted to.

kpotufe, nonparametric contextual bandit, significant shift

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.67)

A Nonparametric Contextual Bandit with Arm-level Eligibility Control for Customer Service Routing

Wen, Ruofeng; Zeng, Wenjun; Liu, Yi

arXiv.org Artificial Intelligence, Sep-8-2022

Amazon Customer Service provides real-time support for millions of customer contacts every year. While bot-resolver helps automate some traffic, we still see high demand for human agents, also called subject matter experts (SMEs). Customers reach out with questions in different domains (return policy, device troubleshooting, etc.). Depending on their training, not all SMEs are eligible to handle all contacts. Routing contacts to eligible SMEs turns out to be a non-trivial problem because SMEs' domain eligibility is subject to training quality and can change over time. To optimally recommend SMEs while simultaneously learning the true eligibility status, we propose to formulate the routing problem with a nonparametric contextual bandit algorithm (K-Boot) plus an eligibility control (EC) algorithm. K-Boot models reward with a kernel smoother on similar past samples selected by $k$-NN, and uses Bootstrap Thompson Sampling for exploration. EC filters arms (SMEs) by the initially system-claimed eligibility and dynamically validates the reliability of this information. The proposed K-Boot is a general bandit algorithm, and EC is applicable to other bandits. Our simulation studies show that K-Boot performs on par with state-of-the-art bandit models, and EC boosts K-Boot performance when a stochastic eligibility signal exists.
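The three ingredients named in this abstract ($k$-NN selection, a kernel smoother, and bootstrap-based Thompson sampling) can be combined roughly as in the sketch below. It is an illustration under stated assumptions, not the authors' K-Boot implementation: the function name `kboot_choose`, the scalar context, the Gaussian kernel, and the default `k` and `bandwidth` are all hypothetical, and eligibility control is reduced to a static list of allowed arms rather than the dynamic validation the abstract describes.

```python
import math
import random


def kboot_choose(x, history, eligible, k=5, bandwidth=0.2, rng=random):
    """Pick an arm for a scalar context x (illustrative sketch only).

    history:  dict mapping arm -> list of (context, reward) pairs seen so far.
    eligible: arms allowed for this contact (stand-in for eligibility control).
    """
    best_arm, best_score = None, -math.inf
    for arm in eligible:
        samples = history.get(arm, [])
        if not samples:
            return arm                                # try arms with no data first
        # k-NN step: keep the k past samples whose contexts are closest to x.
        neighbours = sorted(samples, key=lambda s: abs(s[0] - x))[:k]
        # Bootstrap step: resample the neighbours with replacement, so the
        # estimate is randomized and the policy explores.
        boot = [rng.choice(neighbours) for _ in neighbours]
        # Kernel smoother: Gaussian-weighted average of the resampled rewards.
        weights = [math.exp(-0.5 * ((c - x) / bandwidth) ** 2) for c, _ in boot]
        total = sum(weights)
        if total > 0.0:
            score = sum(w * r for w, (_, r) in zip(weights, boot)) / total
        else:
            score = sum(r for _, r in boot) / len(boot)   # weights underflowed
        if score > best_score:
            best_arm, best_score = arm, score
    return best_arm


if __name__ == "__main__":
    past = {0: [(0.1, 1.0), (0.2, 1.0)], 1: [(0.1, 0.0), (0.2, 0.0)]}
    print(kboot_choose(0.15, past, eligible=[0, 1]))
```

Resampling the neighbours before smoothing stands in for drawing from a posterior: arms with few or noisy nearby samples produce more variable scores and are therefore tried more often.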

data mining, eligibility score, machine learning, (20 more...)

arXiv.org Artificial Intelligence

arXiv:2209.05278

Country: North America > United States (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Retail (0.69)
Information Technology > Services (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.87)

Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Wanigasekara, Nirandika; Yu, Christina

Neural Information Processing Systems, Mar-19-2020, 02:46:13 GMT

Consider a nonparametric contextual multi-armed bandit problem where each arm $a \in [K]$ is associated with a nonparametric reward function $f_a: [0,1] \to \mathbb{R}$ mapping from contexts to the expected reward. Suppose that there is a large set of arms, yet there is a simple but unknown structure amongst the arm reward functions. We present a novel algorithm which learns data-driven similarities amongst the arms, in order to implement adaptive partitioning of the context-arm space for more efficient learning. We provide regret bounds along with simulations that highlight the algorithm's dependence on the local geometry of the reward functions. Papers published at the Neural Information Processing Systems Conference.

nonparametric contextual bandit, reward function, unknown metric, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.47)
